Methods of predicting ancestral virus sequences and uses thereof

ABSTRACT

Methods are described for predicting ancestral sequences for viruses or portions thereof. Also described are predicted ancestral sequences for adeno-associated virus (AAV) capsid polypeptides. The disclosure also provides methods of gene transfer and methods of vaccinating subjects by administering a target antigen operably linked to the AAV capsid polypeptides.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a Divisional application of, and claims the benefit under 35 U.S.C. § 121 to, U.S. application Ser. No. 15/291,470, filed Oct. 12, 2016, which is a Divisional application of, and claims the benefit under 35 U.S.C. § 121 to, U.S. application Ser. No. 15/095,856 filed Apr. 11, 2016, which is a Continuation-In-Part of International Application No. PCT/US2014/060163 filed Oct. 10, 2014, which claims the benefit of priority under 35 U.S.C. § 119(e) to U.S. Application No. 61/889,827 filed Oct. 11, 2013.

TECHNICAL FIELD

This disclosure generally relates to viruses.

BACKGROUND

Circumventing and avoiding a neutralizing or toxic immune response against a gene therapy vector is a major challenge with all gene transfer vector types. Gene transfer to date is most efficiently achieved using vectors based on viruses circulating in humans and animals, e.g., adenovirus and adeno-associated virus (AAV). However, if subjects have been naturally infected with a virus, a subsequent treatment with a vector based on that virus leads to increased safety risks and decreased efficiency of gene transfer due to cellular and humoral immune responses. Capsid antigens are primarily responsible for the innate and/or adaptive immunity toward virus particles; however, viral gene-encoded polypeptides also can be immunogenic.

SUMMARY

This disclosure describes methods of predicting and synthesizing ancestral viral sequences or portions thereof, and also describes virus particles containing such ancestral viral sequences. The methods described herein were applied to adeno-associated virus (AAV); thus, this disclosure describes predicted ancestral AAV sequences and AAV virus particles containing such ancestral AAV sequences. This disclosure also describes the reduced seroprevalence exhibited by virus particles containing ancestral sequences relative to virus particles containing contemporary sequences.

In one aspect, this disclosure includes adeno-associated virus (AAV) capsid polypeptides, e.g., synthetic and/or artificial AAV capsid polypeptides, having an amino acid sequence selected from the group consisting of SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15 and 17. In some implementations, the AAV capsid polypeptides or virus particles comprising the AAV capsid polypeptides exhibit a lower seroprevalence than does an AAV2 capsid polypeptide or a virus particle comprising an AAV2 capsid polypeptide, and the AAV capsid polypeptides or virus particles comprising the AAV capsid polypeptides exhibit about the same or a lower seroprevalence than does an AAV8 capsid polypeptide or a virus particle comprising an AAV8 capsid polypeptide. In some embodiments, the AAV capsid polypeptides or virus particles comprising the AAV capsid polypeptides are neutralized to a lesser extent by human serum than is an AAV2 capsid polypeptide or a virus particle comprising an AAV2 capsid polypeptide, and the AAV capsid polypeptides or virus particles comprising the AAV capsid polypeptides are neutralized to a similar or lesser extent by human serum as is an AAV8 capsid polypeptide or a virus particle comprising an AAV8 capsid polypeptide. In some embodiments, the AAV capsid polypeptides are purified. The AAV capsid polypeptides provided herein can be encoded by a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 2, 4, 6, 8, 10, 12, 14, 16, and 18.

In one aspect, the disclosure provides nucleic acid molecules, e.g., synthetic and/or artificial nucleic acid molecules, encoding an adeno-associated virus (AAV) capsid polypeptide having a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 2, 4, 6, 8, 10, 12, 14, 16, and 18. Also provided are vectors that include such a nucleic acid, and host cells that include such a vector.

In another aspect, the disclosure provides purified virus particles that include an AAV capsid polypeptide described herein. In some embodiments, the virus particles include a transgene.

In other aspects, the disclosure provides adeno-associated virus (AAV) capsid polypeptides, e.g., synthetic and/or artificial AAV capsid polypeptides, having at least 95% (e.g., 97, 98, 99, or 100%) sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 19, 20, 21, 22, 23, 24, 25 and 26. In some embodiments, the AAV capsid polypeptides or virus particles comprising the AAV capsid polypeptide exhibit a lower seroprevalence than does an AAV2 capsid polypeptide or a virus particle comprising an AAV2 capsid polypeptide, and the AAV capsid polypeptide or a virus particle comprising the AAV capsid polypeptide exhibits about the same or a lower seroprevalence than does an AAV8 capsid polypeptide or a virus particle comprising an AAV8 capsid polypeptide. In some embodiments, the AAV capsid polypeptides or virus particles comprising the AAV capsid polypeptide are neutralized to a lesser extent by human serum than is an AAV2 capsid polypeptide or a virus particle comprising an AAV2 capsid polypeptide, and the AAV capsid polypeptide or a virus particle comprising the AAV capsid polypeptide is neutralized to a similar or lesser extent by human serum as is an AAV8 capsid polypeptide or a virus particle comprising an AAV8 capsid polypeptide. In some embodiments, the AAV capsid polypeptides are purified.

In another aspect, the AAV capsid polypeptides described herein can be encoded by nucleic acid sequences as described herein. In one implementation, the disclosure provides nucleic acid molecules encoding an adeno-associated virus (AAV) capsid polypeptide, wherein the nucleic acid molecules have at least 95% (e.g., 97, 98, 99, or 100%) sequence identity to a nucleic acid sequence as shown herein. The disclosure also provides vectors including such nucleic acid molecules, as well as host cells that include such a vector.

In one aspect, the disclosure provides virus particles that include at least one of the AAV capsid polypeptides described herein. In some embodiments, the virus particles include a transgene.

In certain aspects, the disclosure provides methods of administering a virus particle as described herein to a subject in need of gene transfer or vaccination. In some embodiments, the virus particles exhibit less seroprevalence than does an AAV2 virus particle. In some embodiments, the virus particles exhibit about the same or less seroprevalence than does an AAV8 virus particle. In some embodiments, the virus particles are neutralized to a lesser extent by human serum than is an AAV2 virus particle, and the AAV virus particles are neutralized to a similar or lesser extent by human serum as is an AAV8 virus particle.

In one aspect, the disclosure provides methods of administering a target antigen operably linked to an AAV capsid polypeptide as described herein to a subject in need of vaccination. In some embodiments, the AAV capsid polypeptides exhibit less seroprevalence than does an AAV2 capsid polypeptide. In some embodiments, the AAV capsid polypeptide exhibits about the same or less seroprevalence than does an AAV8 capsid polypeptide. In some embodiments, the AAV capsid polypeptides are neutralized to a lesser extent by human serum than is an AAV2 capsid polypeptide, and the AAV capsid polypeptide is neutralized to a similar or lesser extent by human serum as is an AAV8 capsid polypeptide.

In another aspect, the disclosure provides in silico methods of predicting a sequence of an ancestral virus or portion thereof. Such methods typically include providing nucleotide or amino acid sequences from a plurality of contemporary viruses or portions thereof; aligning the sequences using a multiple sequence alignment (MSA) algorithm; modeling evolution to obtain a predicted ancestral phylogeny of the plurality of contemporary viruses or portions thereof; estimating, at a phylogenic node of the predicted ancestral phylogeny, the evolutionary probability of a particular nucleotide or amino acid residue at each position of the sequence; and predicting, based on the estimated probability at each position, a sequence of an ancestral virus or portion thereof.

In some embodiments, one or more, or all, of the steps are performed using a computer processor. In some embodiments, the MSA algorithm uses phylogenetic information to predict if a gap in the alignment is a result of a deletion or an insertion. In some embodiments, the MSA algorithm is the Probabilistic Alignment Kit (PRANK). In some embodiments, the model used for modeling evolution is selected using the Akaike Information Criterion (AIC). In some embodiments, the predicted ancestral phylogeny is obtained using a JTT model with a Gamma distribution model (“+G”) and a frequency calculation of πi (“+F”). In some embodiments, the modeling the evolution step is performed using a JTT+G+F model. In some embodiments, the methods include synthesizing, based on the predicted sequence, the ancestral virus or portion thereof. In some embodiments, the methods include assembling the ancestral virus or portion thereof into an ancestral virus particle.

In some embodiments, the methods also include screening the ancestral virus particle for at least one of the following: (a) replication; (b) gene transfer properties; (c) receptor binding; or (d) seroprevalence. In some embodiments, the ancestral virus particles exhibit less seroprevalence than does a virus particle assembled from at least one of the plurality of contemporary viruses or portions thereof. In some embodiments, the ancestral virus particle is neutralized to a lesser extent by human serum than is a virus particle assembled from at least one of the plurality of contemporary viruses or portions thereof. In some embodiments, the plurality of contemporary viruses or portions thereof belong to a family selected from the group consisting of adenovirus (AV), human immunodeficiency virus (HIV), retrovirus, lentivirus, herpes simplex virus (HSV), vaccinia virus, pox virus, influenza virus, respiratory syncytial virus, parainfluenza virus, and foamy virus.

Thus, the present disclosure provides ancestral viruses or portions thereof that exhibit less susceptibility to pre-existing immunity in current day human populations than do contemporary viruses or portions thereof. Generally, the reduced susceptibility to pre-existing immunity exhibited by the ancestral viruses or portions thereof in current day human populations is reflected as a reduced susceptibility to neutralizing antibodies.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the methods and compositions of matter belong. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the methods and compositions of matter, suitable methods and materials are described below. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety.

DESCRIPTION OF DRAWINGS

FIG. 1 is a schematic showing the relationships between ancestral/contemporary viral infections and ancestral/contemporary host immune response.

FIGS. 2A to 2D are a series of schematics showing an example of an ancestral reconstruction procedure. Data shown are excerpted from a full dataset and represent residues 564-584 (AAV2-VP1 numbering; SEQ ID NOs: 37-43 (top to bottom)).

FIG. 3 illustrates a phylogenetic tree of AAV contemporary sequences generated using the methods described herein.

FIG. 4 illustrates an alignment of ancestral AAV VP1 polypeptides (SEQ ID NOs: 23, 19, 24, 25, 26, 20, 21 and 22, top to bottom).

FIGS. 5A and 5B together illustrate an alignment of functional ancestral AAV VP1 polypeptides and contemporary AAV VP1 polypeptides (SEQ ID NOs: 23, 19, 24, 25, 21, 22, 26, 20, 27, 28, 29, 30, 31, 32, 33 and 34, top to bottom).

FIG. 6 is an electrophoretic gel demonstrating that ancestral AAV VP1 sequences are transcribed and alternately spliced in a manner similar to that for contemporary AAV VP1 sequences.

FIG. 7 is a graph showing the luciferase activity in HEK293 cells transduced with ancestral AAV vectors.

FIG. 8 is a graph showing the sequence comparison (% up from diagonal, # of aa differences below) between the Anc80 library and Anc80L65.

FIGS. 9A-D are images of experimental results demonstrating that Anc80L65 is capable of assembling and yielding particles of high titer. Panel A shows that Anc80L65 is able to produce vector yields equivalent to AAV2; Panel B is a TEM image of virus particles that include Anc80L65; Panel C shows that virus particles that include Anc80L65 are able to produce AAV cap VP1, 2 and 3 proteins based on SDS-PAGE gel under denaturing conditions; and Panel D shows a Western blot of Anc80L65 using the AAV capsid antibody, B1.

FIGS. 10A-C are images of experimental results demonstrating that Anc80L65 is able to infect cells in vitro on HEK293 cells using GFP as readout (Panel A) or luciferase (Panel B) versus AAV2 and/or AAV8 controls and also is efficient at targeting liver following an IV injection of AAV encoding a nuclear LacZ transgene (top row, Panel C: liver), following direct IM injection of an AAV encoding GFP (middle row, Panel C: muscle), and following sub-retinal injection with AAV encoding GFP (bottom row, Panel C: retina).

FIGS. 11A and 11B are sequence identity matrices produced using MAFFT that show the amino acid sequences of the VP1 proteins of ancestral vectors aligned with those of representative extant AAVs (FIG. 11A), and the amino acid sequences of the VP3 proteins of ancestral vectors aligned with those of representative extant AAVs (FIG. 11B).

FIG. 12 is a graph that demonstrates that AAV vectors were produced in triplicate in small scale (6-well dishes). Crude viruses were assessed via qPCR to determine the absolute production of each vector.

FIG. 13 is a table showing the titers of each vector, averaged and compared to those of AAV8.

FIG. 14 shows photographs of the results of experiments in which 1.9E3 GC/cell of each vector was added to HEK293 cells (except for Anc126, in which case MOIs of 2.5E2-3.1E2 GC/cell were achieved). Sixty hours later, infectivity was assessed using fluorescence microscopy.

FIG. 15 is a graph showing the results of experiments in which the same cells from FIG. 16 were lysed and assayed for luciferase expression. As in FIG. 14, Anc126 was not titer controlled with the other vectors, but rather ranged from an MOI of 2.5E2-3.1E2 GC/cell.

FIG. 16 is a table showing the luminescence of cells transduced by each vector, which were averaged and compared to those of AAV8.

FIG. 17 is a chart that provides a summary of in vitro experiments to determine the relative production and infectivity of the ancestral AAV vectors described herein.

DETAILED DESCRIPTION

Gene transfer, either for experimental or therapeutic purposes, relies upon a vector or vector system to shuttle genetic information into target cells. The vector or vector system is considered the major determinant of efficiency, specificity, host response, pharmacology, and longevity of the gene transfer reaction. Currently, the most efficient and effective way to accomplish gene transfer is through the use of vectors or vector systems based on viruses that have been made replication-defective.

Seroprevalence studies, however, indicate that significant proportions of worldwide human populations have been pre-exposed (e.g., by natural infection) to a large number of the viruses currently used in gene transfer and, therefore, harbor pre-existing immunity. Neutralizing antibodies toward the viral vector in these pre-exposed individuals are known to limit, sometimes significantly, the extent of gene transfer or even re-direct the virus away from the target. See, for example, Calcedo et al. (2009, J. Infect. Dis., 199:381-90) and Boutin et al. (2010, Human Gene Ther., 21:704-12). Thus, the present disclosure is based on the recognition that ancestral viruses or portions thereof exhibit less susceptibility to pre-existing immunity (e.g., less susceptibility to neutralizing antibodies) in current day human populations than do contemporary viruses or portions thereof.

FIG. 1 is a schematic showing the relationships between ancestral and contemporary viral infections and ancestral and contemporary host immune response. FIG. 1 shows how ancestral AAVs can be refractory to contemporary pre-existing immunity. A contemporary, extant virus (Vc) is presumed to have evolved from an ancestral species (Vanc), primarily under evolutionary pressures of host immunity through mechanisms of immune escape. Each of these species, Vanc and Vc, has the ability to induce adaptive immunity including B and T cell immunity (Ianc and Ic, respectively). It was hypothesized, and confirmed herein, that immunity induced by contemporary viruses does not necessarily cross-react with an ancestral viral species, which can be substantially different in terms of epitope composition from the extant virus.

This disclosure provides methods of predicting the sequence of an ancestral virus or a portion thereof. One or more of the ancestral virus sequences predicted using the methods described herein can be generated and assembled into a virus particle. As demonstrated herein, virus particles assembled from predicted ancestral viral sequences can exhibit less, sometimes significantly less, seroprevalence than current-day, contemporary virus particles. Thus, the ancestral virus sequences disclosed herein are suitable for use in vectors or vector systems for gene transfer.

Methods of Predicting and Synthesizing an Ancestral Viral Sequence

To predict an ancestral viral sequence, nucleotide or amino acid sequences first are compiled from a plurality of contemporary viruses or portions thereof. While the methods described herein were exemplified using adeno-associated virus (AAV) capsid sequences, the same methods can be applied to other sequences from AAV (e.g., the entire genome, rep sequences, ITR sequences) or to any other virus or portion thereof. Viruses other than AAV include, without limitation, adenovirus (AV), human immunodeficiency virus (HIV), retrovirus, lentivirus, herpes simplex virus (HSV), measles, vaccinia virus, pox virus, influenza virus, respiratory syncytial virus, parainfluenza virus, foamy virus, or any other virus to which pre-existing immunity is considered a problem.

Sequences from as few as two contemporary viruses or portions thereof can be used; however, it is understood that a larger number of sequences of contemporary viruses or portions thereof is desirable, both to include as much of the landscape of modern day sequence diversity as possible and because a larger number of sequences can increase the predictive capabilities of the algorithms described and used. For example, sequences from 10 or more contemporary viruses or portions thereof can be used, sequences from 50 or more contemporary viruses or portions thereof can be used, or sequences from 100 or more contemporary viruses or portions thereof can be used.

Such sequences can be obtained, for example, from any number of public databases including, without limitation, GenBank, UniProt, EMBL, International Nucleotide Sequence Database Collaboration (INSDC), or European Nucleotide Archive. Additionally or alternatively, such sequences can be obtained from a database that is specific to a particular organism (e.g., HIV database). The contemporary sequences can correspond to the entire genome, or only a portion of the genome can be used such as, without limitation, sequences that encode one or more components of the viral capsid, the replication protein, or the ITR sequences.

Next, the contemporary sequences are aligned using a multiple sequence alignment (MSA) algorithm. FIG. 2A is a schematic showing an alignment of multiple sequences. MSA algorithms are well known in the art and generally are designed to be applied to different size datasets and different inputs (e.g., nucleic acid or protein), and to align the sequences in a particular manner (e.g., dynamic programming, progressive, heuristic) and apply different scoring schemes in the alignment (e.g., matrix-based or consistency-based, e.g., minimum entropy, sum of pairs, similarity matrix, gap scores). Well known MSA algorithms include, for example, ClustalW (Thompson et al., 1994, Nuc. Acids Res., 22:4673-90), Kalign (Lassmann et al., 2006, Nuc. Acids Res., 34:W596-99), MAFFT (Katoh et al., 2005, Nuc. Acids Res., 33:511-8), MUSCLE (Edgar, 2004, BMC Bioinform., 5:113), and T-Coffee (Notredame et al., 2000, J. Mol. Biol., 302:205-17).

As described herein, one of the main features when selecting a MSA algorithm for use in the methods described herein is the manner in which the algorithm treats a gap in the alignment. Gaps in a sequence alignment can be assigned a penalty value that is either dependent on or independent of the size of the gap. In the present methods, it is preferred that the MSA algorithm used in the methods described herein apply phylogenetic information to predict whether a gap in the alignment is a result of a deletion or an insertion as opposed to a biased, non-phylogenetic treatment of gaps due to, e.g., insertions and/or deletions. A suitable method of treating gaps in alignments and evolutionary analysis is described in Loytynoja and Goldman, 2008, Science, 320:1632-5, and commercially available algorithms that treat gaps in alignments in a manner that is suitable for use in the methods described herein include the Probabilistic Alignment Kit (PRANK; Goldman Group Software; Loytynoja and Goldman, 2005, PNAS USA, 102:10557-62) and variations of the PRANK algorithm.

An evolutionary model is then applied to the resulting alignment to obtain a predicted ancestral phylogeny (see FIG. 2B). There are a number of evolutionary models available in the art, each of which applies slightly different matrices of replacement rates for amino acids. Without limitation, algorithms for applying models of evolution include the Dayhoff models (e.g., PAM120, PAM160, PAM250; Dayhoff et al., 1978, In Atlas of Protein Sequence and Structure (ed. Dayhoff), pp. 345-52, National Biomedical Research Foundation, Washington D.C.), the JTT model (Jones et al., 1992, Comp. Appl. Biosci., 8:275-82), the WAG model (Whelan and Goldman, 2001, Mol. Biol. Evol., 18:691-9), and the Blosum models (e.g., Blosum45, Blosum62, Blosum80; Henikoff and Henikoff, 1992, PNAS USA, 89:10915-9).

In addition, the constraints that structure and function impose on an evolutionary model can themselves be modeled, for example, by considering that some positions are invariant (“+I”; Reeves, 1992, J. Mol. Evol., 35:17-31), that some positions undergo different rates of change (“+G”; Yang, 1993, Mol. Biol. Evol., 10:1396-1401), and/or that equilibrium frequencies of nucleotides or amino acids are the same as those in the alignment (“+F”; Cao et al., 1994, J. Mol. Evol., 39:519-27).

The fitness of one or more models of evolution can be evaluated using the Akaike Information Criterion (AIC; Akaike, 1973, In Second International Symposium on Information Theory, Petrov and Csaki, eds., pp 267-81, Budapest, Akademiai Kiado), the Bayesian Information Criterion (BIC; Schwarz, 1978, Ann. Statist. 6:461-4), or variations or combinations thereof. In addition, AIC, BIC, or variations or combinations thereof can be used to evaluate the relative importance of including one or more parameters (e.g., the constraints discussed above) in the evolutionary model.
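By way of illustration only, model comparison by AIC or BIC can be sketched as follows. This is a minimal sketch using the standard definitions AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L; the function names, the log-likelihood values, and the parameter counts are hypothetical placeholders and are not results from this disclosure. In practice, k would also count tree parameters such as branch lengths.

```python
import math

def aic(log_likelihood, num_params):
    """Akaike Information Criterion: lower values indicate a better trade-off."""
    return 2 * num_params - 2 * log_likelihood

def bic(log_likelihood, num_params, num_sites):
    """Bayesian Information Criterion; num_sites is the alignment length."""
    return num_params * math.log(num_sites) - 2 * log_likelihood

def best_model(fits, criterion="aic", num_sites=None):
    """fits maps a model name to (log_likelihood, num_params).

    Returns the name of the model with the lowest score and all scores.
    """
    scores = {}
    for name, (lnl, k) in fits.items():
        if criterion == "aic":
            scores[name] = aic(lnl, k)
        else:
            scores[name] = bic(lnl, k, num_sites)
    return min(scores, key=scores.get), scores

# Hypothetical fits, invented for illustration (not data from this disclosure).
example_fits = {
    "JTT": (-10250.0, 0),
    "JTT+G": (-10100.0, 1),
    "JTT+G+F": (-10050.0, 20),
}
```

Calling best_model(example_fits) on these placeholder values selects the model with the lowest AIC; with real data, the log-likelihoods would come from fitting each candidate model to the alignment.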

As explained in the Example section below, ProtTest3 (Darriba et al., 2011, Bioinformatics, 27(8):1164-5) can be used to determine, based on the lowest AIC, that a JTT+G+F algorithm was the most suitable model for AAV evolution. It would be understood by a skilled artisan that a JTT+G+F algorithm also may be used to predict ancestral viral sequences other than AAV capsid polypeptides; however, it also would be understood by a skilled artisan that, depending on the dataset and the fitness score, a different model of evolution may be more suitable.

Once a model of evolution has been selected and its fitness determined, a phylogenetic tree of the virus sequences or portions thereof can be constructed. Constructing phylogenetic trees is known in the art and typically employs maximum likelihood methods such as those implemented by PhyML (Guindon and Gascuel, 2003, Systematic Biology, 52:696-704), MOLPHY (Adachi and Hasegawa, 1996, ed. Tokyo Institute of Statistical Mathematics), BioNJ (Gascuel, 1997, Mol. Biol. Evol., 14:685-95), or PHYLIP (Felsenstein, 1973, Systematic Biology, 22:240-9). A skilled artisan would understand that a balance between computational complexity and the goodness of fit is desirable in a model of amino acid substitutions.

If desired, the phylogenetic tree can be assessed for significance. A number of statistical methods are available and routinely used to evaluate the significance of a model including, without limitation, bootstrap, jackknife, cross-validation, permutation tests, or combinations or variations thereof. Significance also can be evaluated using, for example, an approximate likelihood-ratio test (aLRT; Anisimova and Gascuel, 2006, Systematic Biology, 55:539-52).

At any phylogenetic node of the phylogeny (e.g., an interior phylogenetic node), the sequence can be reconstructed by estimating the evolutionary probability of a particular nucleotide or amino acid residue at each position of the sequence (FIG. 2C). A phylogenic node refers to an intermediate evolutionary branch point within the predicted ancestral phylogeny. As used herein, “evolutionary probability” refers to the probability of the presence of a particular nucleotide or amino acid at a particular position based on an evolutionary model as opposed to a model that does not take into account, for example, an evolutionary shift in the codon usage. The evolutionary probability of a particular nucleotide or amino acid residue at a particular position can be estimated using, for example, any number of maximum likelihood methods including, without limitation, Phylogenetic Analysis by Maximum Likelihood (PAML; Yang, 1997, Comp. Applic. BioSci., 13:555-6) or Phylogenetic Analysis Using Parsimony (PAUP; Sinauer Assoc., Inc., Sunderland, Mass.).
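The per-position probability estimate described above can be illustrated with a simplified sketch. The code below computes, for a single alignment column, the posterior probability of each state at the root of a small tree using the pruning recursion for tree likelihoods. It assumes an equal-rates substitution model over a four-letter alphabet purely for brevity; it is not the JTT+G+F model and is not the PAML implementation, and the tree encoding and function names are hypothetical. Under a time-reversible model such as this one, an interior node can be treated as the root by re-rooting the tree at that node.

```python
import math

ALPHABET = "ACGT"

def transition_prob(a, b, t, n=len(ALPHABET)):
    """Probability of state b after branch length t, starting from state a,
    under an equal-rates model (an assumption made for illustration)."""
    decay = math.exp(-n * t / (n - 1))
    same = 1.0 / n + (1.0 - 1.0 / n) * decay
    diff = 1.0 / n - (1.0 / n) * decay
    return same if a == b else diff

def partial_likelihood(node):
    """node is a leaf (a one-letter string) or a list of (child, branch_length).

    Returns a dict mapping each state to the likelihood of the observed
    leaves below the node, given that the node is in that state.
    """
    if isinstance(node, str):
        return {s: 1.0 if s == node else 0.0 for s in ALPHABET}
    result = {s: 1.0 for s in ALPHABET}
    for child, t in node:
        child_l = partial_likelihood(child)
        for s in ALPHABET:
            result[s] *= sum(
                transition_prob(s, c, t) * child_l[c] for c in ALPHABET
            )
    return result

def root_posterior(node):
    """Posterior probability of each state at the root, with a uniform prior."""
    lik = partial_likelihood(node)
    prior = 1.0 / len(ALPHABET)
    total = sum(prior * v for v in lik.values())
    return {s: prior * v / total for s, v in lik.items()}
```

For example, root_posterior([("A", 0.1), ("A", 0.1), ("G", 0.1)]) assigns the highest posterior probability to "A". Repeating the calculation for every column of the alignment yields the per-position probabilities from which a sequence at that node can be assembled.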

Based on the estimated evolutionary probability of a particular nucleotide or amino acid residue at each position, the predicted sequence of an ancestral virus or portion thereof can be assembled to form a complete or partial synthetic nucleic acid or polypeptide sequence. If desired, the likelihood that any residue was in a given state at a given node along the node can be calculated, and any position along the sequence having a calculated posterior probability beneath a particular threshold can be identified (FIG. 2D). In this manner, an ancestral scaffold sequence can be generated, which can include variations at those positions having a probability below the particular threshold.
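The thresholding step can be sketched as follows. This is an illustrative sketch only: the threshold value of 0.9, the use of "X" as a placeholder for an ambiguous position, and the choice to retain the two most probable residues at such positions are assumptions made for the example, and the posterior values are invented.

```python
def build_scaffold(posteriors, threshold=0.9):
    """posteriors is a list with one dict per position, mapping residue to
    posterior probability.

    Returns (scaffold, ambiguous): the scaffold string, with "X" marking
    positions whose best residue falls below the threshold, and a dict
    mapping each such position to its two most probable residues.
    """
    scaffold = []
    ambiguous = {}
    for index, dist in enumerate(posteriors):
        ranked = sorted(dist.items(), key=lambda kv: kv[1], reverse=True)
        best_residue, best_prob = ranked[0]
        if best_prob >= threshold:
            scaffold.append(best_residue)
        else:
            scaffold.append("X")
            ambiguous[index] = [residue for residue, _ in ranked[:2]]
    return "".join(scaffold), ambiguous

# Invented posterior values for three positions (illustration only).
example_posteriors = [
    {"A": 0.99, "G": 0.01},
    {"S": 0.55, "T": 0.40, "N": 0.05},
    {"K": 0.95, "R": 0.05},
]
```

With these example values, the first and third positions are resolved and the second is flagged as ambiguous between two residues.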

If the ancestral sequence that is predicted using the methods herein is a nucleic acid sequence, the sequence then can be codon optimized so that it can be efficiently translated into an amino acid sequence. Codon usage tables for different organisms are known in the art. Optionally, however, a codon usage table can be designed based on one or more contemporary sequences that have homology (e.g., at least 90% sequence identity) to the ancestral scaffold sequence, and an ancestral sequence as described herein can be codon optimized toward mammalian (e.g., human) codon usage.
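One simple way to derive a codon preference from a reference coding sequence and apply it is sketched below. The sketch assumes uppercase DNA, an in-frame coding sequence, and the standard genetic code; it picks, for each amino acid, the codon used most often in the reference and substitutes it throughout the target sequence. The function names and the short example sequences are hypothetical, and real codon optimization typically weighs additional factors that are not modeled here.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def codons(seq):
    """Split an in-frame sequence into codons, ignoring a trailing partial codon."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]

def translate(seq):
    """Translate a coding sequence; '*' denotes a stop codon."""
    return "".join(CODON_TABLE[c] for c in codons(seq))

def preferred_codons(reference_cds):
    """Map each amino acid to its most frequent codon in the reference."""
    counts = defaultdict(Counter)
    for codon in codons(reference_cds):
        counts[CODON_TABLE[codon]][codon] += 1
    return {aa: ctr.most_common(1)[0][0] for aa, ctr in counts.items()}

def recode(cds, preferred):
    """Replace each codon with the preferred synonymous codon, if one is known."""
    return "".join(
        preferred.get(CODON_TABLE[codon], codon) for codon in codons(cds)
    )
```

For example, with preferred = preferred_codons("ATGGCCGCCGCTAAGAAGAAATAA"), the call recode("ATGGCTAAAGCTTAA", preferred) changes the codons but leaves the encoded amino acid sequence unchanged, which can be checked with translate.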

Any or all of the steps outlined herein for predicting an ancestral viral sequence can be performed or simulated on a computer (e.g., in silico) using a processor or a microprocessor.

Ancestral Adeno-Associated Virus (AAV) Scaffold Sequences

The methods described herein were applied to adeno-associated virus (AAV) using contemporary capsid sequences (described in detail in the Examples below). AAV is widely considered as a therapeutic gene transfer vector and a genetic vaccine vehicle, but exhibits a high seroprevalence in human populations. Using the methods described herein, a phylogenetic tree was assembled using contemporary AAV sequences (see FIGS. 3A-3C) and predicted ancestral scaffold sequences were obtained at the designated phylogenic node (Table 1). As used herein, an ancestral scaffold sequence refers to a sequence that is constructed using the methods described herein (e.g., using evolutionary probabilities and evolutionary modeling) and is not known to have existed in nature. As used herein, the ancestral scaffold sequences are different from consensus sequences, which are typically constructed using the frequency of nucleotides or amino acid residues at a particular position.

TABLE 1

Node      Polypeptide (SEQ ID NO)    Nucleic Acid (SEQ ID NO)
Anc80     1                          2
Anc81     3                          4
Anc82     5                          6
Anc83     7                          8
Anc84     9                          10
Anc94     11                         12
Anc113    13                         14
Anc126    15                         16
Anc127    17                         18

The scaffold sequence of the Anc80 polypeptide is shown in SEQ ID NO:1, which is encoded by the scaffold sequence of the Anc80 nucleic acid shown in SEQ ID NO:2. The scaffold sequence of Anc80 contains 11 positions at which either of two residues was probable. Therefore, the Anc80 scaffold sequence represents 2048 (2¹¹) different sequences.
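The count of 2048 follows from choosing one of two residues independently at each of 11 positions. A minimal sketch of enumerating such a library is shown below; the toy scaffold, the positions, and the residue choices are invented for illustration and are not the Anc80 sequence.

```python
from itertools import product

def expand_scaffold(scaffold, choices):
    """Yield every sequence obtained by filling each ambiguous position.

    choices maps a 0-based position in the scaffold to the residues
    allowed at that position.
    """
    positions = sorted(choices)
    for combo in product(*(choices[p] for p in positions)):
        residues = list(scaffold)
        for position, residue in zip(positions, combo):
            residues[position] = residue
        yield "".join(residues)

# Toy example with two ambiguous positions (2 x 2 = 4 sequences).
toy_library = list(expand_scaffold("MXAXK", {1: "ST", 3: "GA"}))

# Eleven two-way positions give 2**11 = 2048 sequences.
library_size = sum(
    1 for _ in expand_scaffold("X" * 11, {i: "AG" for i in range(11)})
)
```

Because the library size doubles with each additional two-way position, a generator is used here so that sequences are produced one at a time.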

To demonstrate the effectiveness of the methods described herein for predicting the ancestral sequence of a virus or portion thereof, a library of the 2048 predicted ancestral sequences at the AAV Anc80 node was generated and, as described herein, demonstrated to form viable virus particles exhibiting less seroprevalence, in some instances, significantly less seroprevalence, than virus particles assembled with contemporary capsid polypeptides.

Methods of Making Ancestral Virus Particles

After the predicted ancestral sequence of a virus or portion thereof has been obtained, the actual nucleic acid molecule and/or polypeptide(s) can be generated, e.g., synthesized. Methods of generating an artificial nucleic acid molecule or polypeptide based on a sequence obtained, for example, in silico, are known in the art and include, for example, chemical synthesis or recombinant cloning. Additional methods for generating nucleic acid molecules or polypeptides are known in the art and are discussed in more detail below.

Once an ancestral polypeptide has been produced, or once an ancestral nucleic acid molecule has been generated and expressed to produce an ancestral polypeptide, the ancestral polypeptide can be assembled into an ancestral virus particle using, for example, a packaging host cell. The components of a virus particle (e.g., rep sequences, cap sequences, inverted terminal repeat (ITR) sequences) can be introduced, transiently or stably, into a packaging host cell using one or more vectors as described herein. One or more of the components of a virus particle can be based on a predicted ancestral sequence as described herein, while the remaining components can be based on contemporary sequences. In some instances, the entire virus particle can be based on predicted ancestral sequences.

Such ancestral virus particles can be purified using routine methods. As used herein, “purified” virus particles refer to virus particles that are removed from components in the mixture in which they were made such as, but not limited to, viral components (e.g., rep sequences, cap sequences), packaging host cells, and partially- or incompletely-assembled virus particles.

Once assembled, the ancestral virus particles can be screened for, e.g., the ability to replicate; gene transfer properties; receptor binding ability; and/or seroprevalence in a population (e.g., a human population). Determining whether a virus particle can replicate is routine in the art and typically includes infecting a host cell with an amount of virus particles and determining if the virus particles increase in number over time. Determining whether a virus particle is capable of performing gene transfer also is routine in the art and typically includes infecting host cells with virus particles containing a transgene (e.g., a detectable transgene such as a reporter gene, discussed in more detail below). Following infection and clearance of the virus, the host cells can be evaluated for the presence or absence of the transgene. Determining whether a virus particle binds to its receptor is routine in the art, and such methods can be performed in vitro or in vivo.

Determining the seroprevalence of a virus particle is routinely performed in the art and typically includes using an immunoassay to determine the prevalence of one or more antibodies in samples (e.g., blood samples) from a particular population of individuals. Seroprevalence is understood in the art to refer to the proportion of subjects in a population that is seropositive (i.e., has been exposed to a particular pathogen or immunogen), and is calculated as the number of subjects in a population who produce an antibody against a particular pathogen or immunogen divided by the total number of individuals in the population examined. Immunoassays are well known in the art and include, without limitation, an immunodot, Western blot, enzyme immunoassays (EIA), enzyme-linked immunosorbent assay (ELISA), or radioimmunoassay (RIA). As indicated herein, ancestral virus particles exhibit less seroprevalence than do contemporary virus particles (i.e., virus particles assembled using contemporary virus sequences or portions thereof). Simply by way of example, see Xu et al. (2007, Am. J. Obstet. Gynecol., 196:43.e1-6); Paul et al. (1994, J. Infect. Dis., 169:801-6); Sauerbrei et al. (2011, Eurosurv., 16(44):3); and Sakhria et al. (2013, PLoS Negl. Trop. Dis., 7:e2429), each of which determined seroprevalence for a particular antibody in a given population.
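Simply by way of illustration, the seroprevalence calculation defined above (seropositive subjects divided by the total number of subjects examined) can be sketched as follows. The function name and the ten immunoassay calls are hypothetical and are not data from this disclosure.

```python
def seroprevalence(antibody_positive):
    """Return the fraction of subjects whose sample was antibody-positive.

    antibody_positive: iterable of booleans, one per subject examined
    (True = antibody against the pathogen or immunogen was detected).
    """
    results = list(antibody_positive)
    if not results:
        raise ValueError("population must contain at least one subject")
    # Number of seropositive subjects divided by the total number examined.
    return sum(1 for r in results if r) / len(results)


# Hypothetical immunoassay calls for ten subjects (illustrative only).
calls = [True, False, False, True, False, False, False, True, False, False]
print(f"seroprevalence = {seroprevalence(calls):.0%}")
```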

As described herein, ancestral virus particles are neutralized by a person's, e.g., patient's, immune system to a lesser extent than are contemporary virus particles. Several methods to determine the extent of neutralizing antibodies in a serum sample are available. For example, a neutralizing antibody assay measures the titer at which an experimental sample contains an antibody concentration that neutralizes infection by 50% or more as compared to a control sample without antibody. See, also, Fisher et al. (1997, Nature Med., 3:306-12) and Manning et al. (1998, Human Gene Ther., 9:477-85).

With respect to the ancestral AAV capsid polypeptides exemplified herein, the seroprevalence and/or extent of neutralization can be compared, for example, to an AAV8 capsid polypeptide or virus particle that includes an AAV8 capsid polypeptide, or an AAV2 capsid polypeptide or virus particle that includes an AAV2 capsid polypeptide. It is generally understood in the art that AAV8 capsid polypeptides or virus particles exhibit a seroprevalence, and a resulting neutralization, in the human population that is considered low, while AAV2 capsid polypeptides or virus particles exhibit a seroprevalence, and a resulting neutralization, in the human population that is considered high. The particular seroprevalence will depend upon the population examined as well as the immunological methods used, but there are reports that AAV8 exhibits a seroprevalence of about 22% up to about 38%, while AAV2 exhibits a seroprevalence of about 43.5% up to about 72%. See, for example, Boutin et al., 2010, “Prevalence of serum IgG and neutralizing factors against AAV types 1, 2, 5, 6, 8 and 9 in the healthy population: implications for gene therapy using AAV vectors,” Hum. Gene Ther., 21:704-12. See, also, Calcedo et al., 2009, J. Infect. Dis., 199:381-90.

Predicted Adeno-Associated Virus (AAV) Ancestral Nucleic Acid and Polypeptide Sequences

A number of different clones from the library encoding predicted ancestral capsid polypeptides from the Anc80 node were sequenced, and the amino acid sequences of representative AAV predicted ancestral capsid polypeptides are shown in SEQ ID NO: 19 (Anc80L27); SEQ ID NO: 20 (Anc80L59); SEQ ID NO: 21 (Anc80L60); SEQ ID NO: 22 (Anc80L62); SEQ ID NO: 23 (Anc80L65); SEQ ID NO: 24 (Anc80L33); SEQ ID NO: 25 (Anc80L36); and SEQ ID NO: 26 (Anc80L44). Those skilled in the art would appreciate that the nucleic acid sequence encoding each amino acid sequence can readily be determined.

In addition to the predicted ancestral capsid polypeptides having the sequences shown in SEQ ID NOs: 19, 20, 21, 22, 23, 24, 25, or 26, polypeptides are provided that have at least 95% sequence identity (e.g., at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity) to the predicted ancestral capsid polypeptides having the sequences shown in SEQ ID NOs: 19, 20, 21, 22, 23, 24, 25, or 26. Similarly, nucleic acid molecules are provided that have at least 95% sequence identity (e.g., at least 96%, at least 97%, at least 98%, at least 99% or 100% sequence identity) to the nucleic acid molecules encoding the ancestral capsid polypeptides (i.e., having at least 95% sequence identity).

In calculating percent sequence identity, two sequences are aligned and the number of identical matches of nucleotides or amino acid residues between the two sequences is determined. The number of identical matches is divided by the length of the aligned region (i.e., the number of aligned nucleotides or amino acid residues) and multiplied by 100 to arrive at a percent sequence identity value. It will be appreciated that the length of the aligned region can be a portion of one or both sequences up to the full-length size of the shortest sequence. It also will be appreciated that a single sequence can align with more than one other sequence and hence, can have different percent sequence identity values over each aligned region.
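By way of illustration only, the percent sequence identity calculation described above can be sketched for two sequences that have already been aligned. The sketch assumes that gap characters (`-`) mark unaligned positions and that columns containing a gap are excluded from the aligned length; that treatment of gaps, the function name, and the toy sequences are assumptions rather than part of the disclosure, and BLAST programs may report identity differently.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the aligned (non-gap) columns of two sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences contribute a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        raise ValueError("no aligned residues to compare")
    matches = sum(1 for a, b in pairs if a == b)
    # Identical matches / length of aligned region * 100.
    return 100.0 * matches / len(pairs)


# Toy pre-aligned peptides (hypothetical, not taken from any SEQ ID NO).
seq_a = "MAADGYL-PDW"
seq_b = "MSADGYLQPDW"
print(percent_identity(seq_a, seq_b))
```

In this toy pair, ten columns are aligned without a gap and nine of them match.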

The alignment of two or more sequences to determine percent sequence identity can be performed using the algorithm described by Altschul et al. (1997, Nucleic Acids Res., 25:3389-3402) as incorporated into BLAST (basic local alignment search tool) programs, available at ncbi.nlm.nih.gov on the World Wide Web. BLAST searches can be performed to determine percent sequence identity between a sequence (nucleic acid or amino acid) and any other sequence or portion thereof aligned using the Altschul et al. algorithm. BLASTN is the program used to align and compare the identity between nucleic acid sequences, while BLASTP is the program used to align and compare the identity between amino acid sequences. When utilizing BLAST programs to calculate the percent identity between a sequence and another sequence, the default parameters of the respective programs generally are used.

Representative alignments are shown in FIGS. 4A and 4B and FIGS. 5A and 5B. FIGS. 4A and 4B show an alignment of ancestral AAV VP1 capsid polypeptides, designated Anc80L65 (SEQ ID NO: 23), Anc80L27 (SEQ ID NO: 19), Anc80L33 (SEQ ID NO: 24), Anc80L36 (SEQ ID NO: 25), Anc80L44 (SEQ ID NO: 26), Anc80L59 (SEQ ID NO: 20), Anc80L60 (SEQ ID NO: 21), and Anc80L62 (SEQ ID NO: 22). The alignment shown in FIGS. 4A and 4B confirms the predicted variation at each of the 11 sites, and a single non-synonymous mutation at position 609E of Anc80L60 (SEQ ID NO: 21), which may be a cloning artifact. FIGS. 5A and 5B show an alignment between ancestral AAV VP1 capsid polypeptides (Anc80L65 (SEQ ID NO: 23), Anc80L27 (SEQ ID NO: 19), Anc80L33 (SEQ ID NO: 24), Anc80L36 (SEQ ID NO: 25), Anc80L60 (SEQ ID NO: 21), Anc80L62 (SEQ ID NO: 22), Anc80L44 (SEQ ID NO: 26), and Anc80L59 (SEQ ID NO: 20)) and contemporary AAV VP1 capsid polypeptides (AAV8 (SEQ ID NO: 27), AAV9 (SEQ ID NO: 28), AAV6 (SEQ ID NO: 29), AAV1 (SEQ ID NO: 30), AAV2 (SEQ ID NO: 31), AAV3 (SEQ ID NO: 32), AAV3B (SEQ ID NO: 33), and AAV7 (SEQ ID NO: 34)). The alignment in FIGS. 5A and 5B shows that the ancestral AAV sequences have between about 85% and 91% sequence identity to contemporary AAV sequences.

Vectors containing nucleic acid molecules that encode polypeptides also are provided. Vectors, including expression vectors, are commercially available or can be produced by recombinant technology. A vector containing a nucleic acid molecule can have one or more elements for expression operably linked to such a nucleic acid molecule, and further can include sequences such as those encoding a selectable marker (e.g., an antibiotic resistance gene), and/or those that can be used in purification of a polypeptide (e.g., 6×His tag). Elements for expression include nucleic acid sequences that direct and regulate expression of nucleic acid coding sequences. One example of an expression element is a promoter sequence. Expression elements also can include one or more of introns, enhancer sequences, response elements, or inducible elements that modulate expression of a nucleic acid molecule. Expression elements can be of bacterial, yeast, insect, mammalian, or viral origin and vectors can contain a combination of expression elements from different origins. As used herein, operably linked means that elements for expression are positioned in a vector relative to a coding sequence in such a way as to direct or regulate expression of the coding sequence.

A nucleic acid molecule, e.g., a nucleic acid molecule in a vector (e.g., an expression vector, a viral vector) can be introduced into a host cell. The term “host cell” refers not only to the particular cell(s) into which the nucleic acid molecule has been introduced, but also to the progeny or potential progeny of such a cell. Many suitable host cells are known to those skilled in the art; host cells can be prokaryotic cells (e.g., E. coli) or eukaryotic cells (e.g., yeast cells, insect cells, plant cells, mammalian cells). Representative host cells can include, without limitation, A549, WEHI, 3T3, 10T1/2, BHK, MDCK, COS 1, COS 7, BSC 1, BSC 40, BMT 10, VERO, WI38, HeLa, 293 cells, Saos, C2C12, L cells, HT1080, HepG2 and primary fibroblast, hepatocyte and myoblast cells derived from mammals including human, monkey, mouse, rat, rabbit, and hamster. Methods for introducing nucleic acid molecules into host cells are well known in the art and include, without limitation, calcium phosphate precipitation, electroporation, heat shock, lipofection, microinjection, and viral-mediated nucleic acid transfer (e.g., transduction).

With respect to polypeptides, “purified” refers to a polypeptide (i.e., a peptide or a polypeptide) that has been separated or purified from cellular components that naturally accompany it. Typically, the polypeptide is considered “purified” when it is at least 70% (e.g., at least 75%, 80%, 85%, 90%, 95%, or 99%) by dry weight, free from the polypeptides and naturally occurring molecules with which it is naturally associated. Since a polypeptide that is chemically synthesized is, by nature, separated from the components that naturally accompany it, a synthetic polypeptide is considered “purified,” but further can be removed from the components used to synthesize the polypeptide (e.g., amino acid residues). With respect to nucleic acid molecules, “isolated” refers to a nucleic acid molecule that is separated from other nucleic acid molecules that are usually associated with it in the genome. In addition, an isolated nucleic acid molecule can include an engineered nucleic acid molecule such as a recombinant or a synthetic nucleic acid molecule.

Polypeptides can be obtained (e.g., purified) from natural sources (e.g., a biological sample) by known methods such as DEAE ion exchange, gel filtration, and/or hydroxyapatite chromatography. A purified polypeptide also can be obtained, for example, by expressing a nucleic acid molecule in an expression vector or by chemical synthesis. The extent of purity of a polypeptide can be measured using any appropriate method, e.g., column chromatography, polyacrylamide gel electrophoresis, or HPLC analysis. Similarly, nucleic acid molecules can be obtained (e.g., isolated) using routine methods such as, without limitation, recombinant nucleic acid technology (e.g., restriction enzyme digestion and ligation) or the polymerase chain reaction (PCR; see, for example, PCR Primer: A Laboratory Manual, Dieffenbach & Dveksler, Eds., Cold Spring Harbor Laboratory Press, 1995). In addition, isolated nucleic acid molecules can be chemically synthesized.

Methods of Using Ancestral Viruses or Portions Thereof

An ancestral virus or portion thereof as described herein, particularly those that exhibit reduced seroprevalence relative to contemporary viruses or portions thereof, can be used in a number of research and/or therapeutic applications. For example, an ancestral virus or portion thereof as described herein can be used in human or animal medicine for gene therapy (e.g., in a vector or vector system for gene transfer) or for vaccination (e.g., for antigen presentation). More specifically, an ancestral virus or portion thereof as described herein can be used for gene addition, gene augmentation, genetic delivery of a polypeptide therapeutic, genetic vaccination, gene silencing, genome editing, gene therapy, RNAi delivery, cDNA delivery, mRNA delivery, miRNA delivery, miRNA sponging, genetic immunization, optogenetic gene therapy, transgenesis, DNA vaccination, or DNA immunization.

A host cell can be transduced or infected with an ancestral virus or portion thereof in vitro (e.g., growing in culture) or in vivo (e.g., in a subject). Host cells that can be transduced or infected with an ancestral virus or portion thereof in vitro are described herein; host cells that can be transduced or infected with an ancestral virus or portion thereof in vivo include, without limitation, brain, liver, muscle, lung, eye (e.g., retina, retinal pigment epithelium), kidney, heart, gonads (e.g., testes, uterus, ovaries), skin, nasal passages, digestive system, pancreas, islet cells, neurons, lymphocytes, ear (e.g., inner ear), hair follicles, and/or glands (e.g., thyroid).

An ancestral virus or portion thereof as described herein can be modified to include a transgene (in cis or trans with other viral sequences). A transgene can be, for example, a reporter gene (e.g., beta-lactamase, beta-galactosidase (LacZ), alkaline phosphatase, thymidine kinase, green fluorescent polypeptide (GFP), chloramphenicol acetyltransferase (CAT), or luciferase, or fusion polypeptides that include an antigen tag domain such as hemagglutinin or Myc) or a therapeutic gene (e.g., genes encoding hormones or receptors thereof, growth factors or receptors thereof, differentiation factors or receptors thereof, immune system regulators (e.g., cytokines and interleukins) or receptors thereof, enzymes, RNAs (e.g., inhibitory RNAs or catalytic RNAs), or target antigens (e.g., oncogenic antigens, autoimmune antigens)).

The particular transgene will depend, at least in part, on the particular disease or deficiency being treated. Simply by way of example, gene transfer or gene therapy can be applied to the treatment of hemophilia, retinitis pigmentosa, cystic fibrosis, Leber congenital amaurosis, lysosomal storage disorders, inborn errors of metabolism (e.g., inborn errors of amino acid metabolism including phenylketonuria, inborn errors of organic acid metabolism including propionic acidemia, inborn errors of fatty acid metabolism including medium-chain acyl-CoA dehydrogenase deficiency (MCAD)), cancer, achromatopsia, cone-rod dystrophies, macular degenerations (e.g., age-related macular degeneration), lipoprotein lipase deficiency, familial hypercholesterolemia, spinal muscular atrophy, Duchenne's muscular dystrophy, Alzheimer's disease, Parkinson's disease, obesity, inflammatory bowel disorder, diabetes, congestive heart failure, hypercholesterolemia, hearing loss, coronary heart disease, familial renal amyloidosis, Marfan's syndrome, fatal familial insomnia, Creutzfeldt-Jakob disease, sickle-cell disease, Huntington's disease, fronto-temporal lobar degeneration, Usher syndrome, lactose intolerance, lipid storage disorders (e.g., Niemann-Pick disease, type C), Batten disease, choroideremia, glycogen storage disease type II (Pompe disease), ataxia telangiectasia (Louis-Bar syndrome), congenital hypothyroidism, severe combined immunodeficiency (SCID), and/or amyotrophic lateral sclerosis (ALS).

A transgene also can be, for example, an immunogen that is useful for immunizing a subject (e.g., a human, an animal (e.g., a companion animal, a farm animal, an endangered animal)). For example, immunogens can be obtained from an organism (e.g., a pathogenic organism) or an immunogenic portion or component thereof (e.g., a toxin polypeptide or a by-product thereof). By way of example, pathogenic organisms from which immunogenic polypeptides can be obtained include viruses (e.g., picornavirus, enteroviruses, orthomyxovirus, reovirus, retrovirus), prokaryotes (e.g., Pneumococci, Staphylococci, Listeria, Pseudomonas), and eukaryotes (e.g., amebiasis, malaria, leishmaniasis, nematodes). It would be understood that the methods described herein and compositions produced by such methods are not to be limited by any particular transgene.

An ancestral virus or portion thereof, usually suspended in a physiologically compatible carrier, can be administered to a subject (e.g., a human or non-human mammal). Suitable carriers include saline, which may be formulated with a variety of buffering solutions (e.g., phosphate buffered saline), lactose, sucrose, calcium phosphate, gelatin, dextran, agar, pectin, and water. The ancestral virus or portion thereof is administered in sufficient amounts to transduce or infect the cells and to provide sufficient levels of gene transfer and expression to provide a therapeutic benefit without undue adverse effects. Conventional and pharmaceutically acceptable routes of administration include, but are not limited to, direct delivery to an organ such as, for example, the liver or lung, orally, intranasally, intratracheally, by inhalation, intravenously, intramuscularly, intraocularly, subcutaneously, intradermally, transmucosally, or by other routes of administration. Routes of administration can be combined, if desired.

The dose of the ancestral virus or portion thereof administered to a subject will depend primarily on factors such as the condition being treated, and the age, weight, and health of the subject. For example, a therapeutically effective dosage of an ancestral virus or portion thereof to be administered to a human subject generally is in the range of from about 0.1 ml to about 10 ml of a solution containing concentrations of from about 1×10¹ to 1×10¹² genome copies (GCs) of ancestral viruses (e.g., about 1×10³ to 1×10⁹ GCs). Transduction and/or expression of a transgene can be monitored at various time points following administration by DNA, RNA, or protein assays. In some instances, the levels of expression of the transgene can be monitored to determine the frequency and/or amount of dosage. Dosage regimens similar to those described for therapeutic purposes also may be utilized for immunization.

The methods described herein also can be used to model forward evolution, so as to modify or ablate one or more immunogenic domains of a virus or portion thereof.

In accordance with the present invention, there may be employed conventional molecular biology, microbiology, biochemical, and recombinant DNA techniques within the skill of the art. Such techniques are explained fully in the literature. The invention will be further described in the following examples, which do not limit the scope of the methods and compositions of matter described in the claims.

EXAMPLES

Example 1 Computational Prediction of Ancestral Sequences

A set of 75 different amino acid sequences of AAV capsids was obtained from a number of public databases including GenBank, and the sequences were aligned using the PRANK-MSA algorithm, version 121002, with the option “-F”.

ProtTest3 (see, for example, Darriba et al., 2011, Bioinformatics, 27(8):1164-5; available at darwin.uvigo.es/software/prottest3 on the World Wide Web) was used to evaluate different models of polypeptide evolution (e.g., those included in ProtTest3, namely, JTT, LG, WAG, VT, CpRev, RtRev, Dayhoff, DCMut, FLU, Blosum62, HIVb, MtArt, MtMam) under different conditions (e.g., those included in ProtTest3, namely, “+I”, “+F”, “+G”, and combinations thereof). The JTT model (Jones et al., 1992, Comp. Appl. Biosci., 8:275-82) with +G and +F (Yang, 1993, Mol. Biol. Evol., 10:1396-1401; and Cao et al., 1994, J. Mol. Evol., 39:519-27) was selected based on its Akaike Information Criterion (AIC; Akaike, 1974, IEEE Transactions on Automatic Control, 19:716-23) score as implemented in ProtTest3.
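Simply to illustrate AIC-based model selection, the following sketch computes the standard score, AIC = 2k − 2 ln L (k free parameters, maximized log-likelihood ln L), and picks the candidate with the lowest value. It is not the ProtTest3 implementation; the model labels, log-likelihoods, and parameter counts are placeholders invented for the example, not values obtained for the AAV alignment.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def best_model(candidates):
    """Return the name of the candidate with the lowest AIC.

    candidates: dict mapping model name -> (log_likelihood, n_params).
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))


# Placeholder values only (hypothetical models, not ProtTest3 output).
candidates = {
    "model_A": (-10000.0, 170),
    "model_B": (-10050.0, 150),
    "model_C": (-10100.0, 151),
}
for name, (lnl, k) in candidates.items():
    print(name, aic(lnl, k))
print("selected:", best_model(candidates))
```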

A phylogeny of AAV evolution was constructed using PhyML (Guindon and Gascuel, 2003, Systematic Biology, 52:696-704). See FIG. 3. The tree was generated using the JTT+F substitution model with 4 discrete substitution categories and an estimated Gamma shape parameter. The resultant trees were improved via Nearest Neighbor Interchange (NNI) and Subtree Pruning and Re-Grafting (SPR), and assessed for significance via bootstrap and approximate likelihood-ratio test (aLRT; Anisimova and Gascuel, 2006, Systematic Biology, 55:539-52) using the “SH-Like” variant.

The phylogenetic tree constructed above was then used to estimate the ancestral states of the AAV capsid at every node interior to the phylogeny. The ancestral capsid sequences were reconstructed using maximum likelihood principles through the Phylogenetic Analysis by Maximum Likelihood (PAML) software (Yang, 1997, Comp. Applic. BioSci., 13:555-6; available at abacus.gene.ucl.ac.uk/software/paml.html on the World Wide Web) wrapped in Lazarus (Sourceforge at sf.net). More specifically, the Lazarus/PAML reconstruction was set to generate an amino acid reconstruction using the JTT+F substitution model using 4 gamma-distributed categories. AAV5 was used as an outgroup. Finally, the “I” option was added to place indels (i.e., coded binarily and placed via Maximum Parsimony using Fitch's algorithm) after the PAML reconstruction was done.

Because the reconstruction was done in a maximum-likelihood fashion, the likelihood that any residue was in a given position at a given node can be calculated. To do this, an additional script was written to identify all positions along the sequence with a calculated posterior probability beneath a certain threshold. A threshold of 0.3 was selected, meaning that any amino acid with a calculated posterior probability of greater than 0.3 was included in the synthesis of the library. These residues were selected to be variants of interest in the library.
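The thresholding step can be sketched as follows. This is not the script referred to above; it is a minimal illustration that assumes the reconstruction output has been parsed into one dictionary per position mapping each amino acid to its posterior probability. The function names, the fallback to the single most probable residue when nothing exceeds the threshold, and the three toy positions are assumptions.

```python
def library_residues(posteriors, threshold=0.3):
    """For each position, list residues whose posterior exceeds the threshold.

    posteriors: list of dicts, one per sequence position, mapping
    amino acid -> posterior probability at the ancestral node.
    """
    library = []
    for position in posteriors:
        kept = [aa for aa, p in position.items() if p > threshold]
        # Order by decreasing posterior so the most probable residue is first.
        kept.sort(key=lambda aa: -position[aa])
        if not kept:
            # Assumption: fall back to the single most probable residue.
            kept = [max(position, key=position.get)]
        library.append(kept)
    return library


def library_size(library):
    """Number of distinct sequences encoded by the per-position choices."""
    size = 1
    for kept in library:
        size *= len(kept)
    return size


# Toy posteriors for three positions (hypothetical values).
toy = [
    {"M": 0.99, "L": 0.01},
    {"A": 0.55, "S": 0.40, "T": 0.05},
    {"K": 0.62, "R": 0.38},
]
lib = library_residues(toy)
print(lib, library_size(lib))
```

Positions that keep more than one residue are the variable sites of the library; the library size is the product of the number of residues kept at each position.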

To finalize the sequence, an additional utility had to be coded to select codons. A script was written to derive codons similar to those of another AAV sequence (AAVRh10, which has about 92% sequence identity to the Anc80 scaffold sequence) and apply a novel algorithm to substitute codons where there were sequence mismatches based on a codon-substitution matrix. The novel algorithm is shown below:

-   Given: amino acid sequence, Pt, with corresponding nucleotide sequence, Nt, where Nt codes for Pt; and protein sequence, Pi, where Pi exhibits strong homology to Pt.
-   Align Pi with Pt using Needleman-Wunsch using the Blosum62 table for scoring. Generate a new nucleotide sequence, Ni, by stepping through the protein alignment, using
    -   the corresponding codon from Nt, where the amino acid in Pt exactly matches that in Pi,
    -   the “best scoring” codon from the Codon-PAM matrix (Schneider et al., 2005, BMC Bioinform., 6:134) where there is a substitution,
    -   a gap where there exists a gap in Pi aligned against an amino-acid in Pt, and
    -   the most frequently occurring nucleotide in the Nt (coding for a given amino acid) where there exists an amino-acid in Pi aligned against a gap in Pt.
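The stepping-through stage of the algorithm above can be sketched as follows, under several stated assumptions. The Needleman-Wunsch/Blosum62 alignment is assumed to have been done already, so Pt and Pi are supplied as gapped strings. The Codon-PAM lookup is replaced by a caller-supplied function, `pick_substitution_codon`, and the toy picker shown here (fewest nucleotide changes from the source codon) is only a hypothetical stand-in, not the Codon-PAM matrix. “Most frequently occurring nucleotide” is read here as the most frequent codon used in Nt for that amino acid. All names and toy sequences are illustrative.

```python
from collections import Counter


def build_ni(pt_aln, pi_aln, nt, pick_substitution_codon):
    """Derive Ni (coding for Pi) from a Pt/Pi alignment and Nt (coding for Pt)."""
    if len(pt_aln) != len(pi_aln):
        raise ValueError("aligned proteins must have equal length")
    pt = pt_aln.replace("-", "")
    if len(nt) != 3 * len(pt):
        raise ValueError("Nt must contain one codon per residue of Pt")
    codons = [nt[i:i + 3] for i in range(0, len(nt), 3)]

    # Codon usage observed in Nt for each amino acid of Pt.
    usage = {}
    for aa, codon in zip(pt, codons):
        usage.setdefault(aa, Counter())[codon] += 1

    out = []
    k = 0  # index of the next unused codon in Nt
    for pt_aa, pi_aa in zip(pt_aln, pi_aln):
        if pt_aa == "-":
            if pi_aa != "-":
                # Residue in Pi opposite a gap in Pt: most frequent codon in Nt.
                if pi_aa not in usage:
                    raise ValueError(f"no codon for {pi_aa!r} observed in Nt")
                out.append(usage[pi_aa].most_common(1)[0][0])
            continue
        codon = codons[k]
        k += 1
        if pi_aa == "-":
            continue  # gap in Pi: emit nothing for this column
        if pi_aa == pt_aa:
            out.append(codon)  # exact match: reuse the codon from Nt
        else:
            out.append(pick_substitution_codon(codon, pi_aa))
    return "".join(out)


# Hypothetical stand-in for the Codon-PAM lookup (illustration only).
TOY_CODONS = {"A": ["GCC", "GCT"], "S": ["TCC", "AGC"], "K": ["AAA"], "M": ["ATG"]}


def toy_picker(source_codon, target_aa):
    """Pick the target codon with the fewest nucleotide differences."""
    return min(TOY_CODONS[target_aa],
               key=lambda c: sum(x != y for x, y in zip(c, source_codon)))


ni = build_ni("MAK-AA", "MS-AAA", "ATGGCCAAAGCCGCT", toy_picker)
print(ni)
```

In the toy run, Pt is M-A-K-A-A and Pi is M-S-A-A-A; the resulting Ni reuses the Nt codons at matching positions, substitutes a serine codon at the mismatch, drops the lysine codon, and inserts the most frequent alanine codon of Nt opposite the gap in Pt.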

In addition, two single nucleotide changes were made to ablate transcription of assembly-activating protein (AAP), which is encoded out of frame within the AAV capsid gene in the wild type AAV. Since the coding of AAP (contemporary or ancestral) was not a part of this reconstruction, the expression of AAP was ablated by making a synonymous mutation in the cap sequence, and the AAP sequence was provided in trans during viral production.

Example 2 Expression of Ancestral AAV VP1 Sequences

Experiments were performed to determine whether predicted ancestral AAV capsid sequences can be used to make viral vectors.

A number of the predicted ancestral AAV capsid sequences were cloned. The library of ancestral capsids was transferred to a rep-cap expression plasmid to enable viral particle formation in transient transfection. To maintain appropriate expression levels and splicing of VP1, VP2, and VP3, library cap genes were cloned by cutting HindIII, located 5′ of cap in the rep coding sequence, and SpeI, which was engineered between the cap stop codon and the polyadenylation signal. Consequently, to clone the ancestral capsids into a more conventional “REP/CAP” construct, the passaging-plasmid was digested with HindIII and SpeI, gel purified, and ligated into a similarly digested rep/cap plasmid.

The expressed polypeptides were resolved on a 10% SDS gel. As shown in FIG. 6, the capsid polypeptides were appropriately expressed and spliced into VP1, VP2, and VP3 from a number of ancestral AAV sequences (Anc80L44, Anc80L27, and Anc80L65) as well as from a contemporary AAV sequence, AAV2/8.

Example 3 Viral Titration

AAV was produced in HEK293 cells via transient co-transfection of plasmids encoding all elements required for viral particle assembly. Briefly, HEK293 cells were grown to 90% confluency and transfected with (a) the viral genome plasmid encoding the luciferase transgene (expressed by the CMV promoter) flanked by AAV2 ITRs, (b) the AAV packaging plasmid encoding AAV2 rep and the synthesized capsid proteins disclosed herein, (c) AAV2-AAP expressing capsid, and (d) adenoviral helper genes needed for AAV packaging and assembly. Cells were incubated at 37° C. for 2 days, and cells and media were harvested and collected.

The cell-media suspension was lysed by 3 consecutive freeze-thaw cycles. Next, the lysate was cleared by centrifugation and treated with an enzyme under conditions to perform exhaustive DNA digestion, here BENZONASE™, to digest any DNA present outside of the virus particle. The AAV preparation was diluted to fall within the linear measurement range of a control DNA template, in this case linearized plasmid with identical TAQMAN™ primer and probe binding sequence as compared to the vector genome. TAQMAN™ PCR was performed with primers and probe annealing to the viral vector genome of choice. Titer was calculated based on the TAQMAN™ measurement in genome copies (GC) per milliliter (ml) as shown in Table 2 below.

TABLE 2
Titers (GC/ml)

Vector               Small scale #1    Small scale #2
AAV2/2               1.12 × 10⁹        1.99 × 10⁹
AAV2/8               4.17 × 10¹⁰       5.91 × 10¹⁰
Anc80L27             8.01 × 10⁸        1.74 × 10⁹
Anc80L44             1.52 × 10⁹        1.43 × 10⁹
Anc80L65             1.42 × 10⁹        2.05 × 10⁹
No capsid control    5.23 × 10⁵        7.25 × 10⁵

Small scale vector production results on ancestrally reconstructed AAV capsid particles demonstrated yields that were similar to AAV2, but reduced relative to AAV8, both of which are vector preparations based on contemporary AAVs.
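The comparison stated above can be checked directly against the Table 2 values. The sketch below averages the two small-scale preparations for each vector and expresses the ancestral titers as a fold of the AAV2/2 and AAV2/8 means; the use of an arithmetic mean and the helper names are choices made for this illustration, not part of the described method.

```python
# Titers in GC/ml from Table 2 (small scale #1, small scale #2).
titers = {
    "AAV2/2": (1.12e9, 1.99e9),
    "AAV2/8": (4.17e10, 5.91e10),
    "Anc80L27": (8.01e8, 1.74e9),
    "Anc80L44": (1.52e9, 1.43e9),
    "Anc80L65": (1.42e9, 2.05e9),
    "No capsid control": (5.23e5, 7.25e5),
}


def mean_titer(name):
    """Arithmetic mean of the replicate preparations for one vector."""
    values = titers[name]
    return sum(values) / len(values)


def fold_vs(name, reference):
    """Mean titer of `name` as a multiple of the mean titer of `reference`."""
    return mean_titer(name) / mean_titer(reference)


for vector in ("Anc80L27", "Anc80L44", "Anc80L65"):
    print(f"{vector}: {fold_vs(vector, 'AAV2/2'):.2f}x AAV2/2, "
          f"{fold_vs(vector, 'AAV2/8'):.3f}x AAV2/8")
```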

Example 4 In Vitro Viral Transduction

In vitro viral transductions were performed to evaluate the ability of viruses containing the predicted ancestral AAV sequences to infect cells.

Following high throughput vector production using the Anc80 library of sequences, HEK293 cells were transduced with each viral vector. In addition to an Anc80 sequence, each viral vector contained a luciferase transgene. Luciferase was measured by quantification of bioluminescence in a 96 well plate reader following addition of luciferin substrate to the transduced cells or cell lysate. Following quantification, a heat map of luciferase expression in four concatenated 96-well plates was produced (excluding a column of controls in each plate). Due to the large number of insertions, deletions, and transitions associated with the process of high throughput vector production, many of the vectors were non-functional. For purposes herein, only viruses that were functional in this assay (i.e., able to transduce HEK293 cells and express the transgene) were evaluated further.

HEK293 cells were transduced, at equal multiplicity of infection (MOI) of 1×10⁴ genome copies (GC) per cell, with two contemporary AAV vectors (AAV2/2 and AAV2/8) and three predicted ancestral AAV vectors (Anc80L27, Anc80L44, and Anc80L65). Each vector contained either a luciferase-encoding transgene or an eGFP-encoding transgene. Cells were imaged 60 hours later using the GFP channel of an AMG EvosF1 Optical Microscope. FIG. 7 shows the luciferase expression following the in vitro transduction. Each of the ancestral AAV viruses demonstrated efficient transduction of HEK293 cells.

Example 5 In Vivo Retinal Transduction

Retinal transductions were performed to determine whether or not the ancestral AAV vectors are able to target murine retinal cells in vivo.

Murine eyes were transduced with 2×10⁸ genome copies (GC) of three different ancestral AAVs (Anc80L27, Anc80L44, and Anc80L65) and a contemporary AAV (AAV2/8), all of which included an eGFP-encoding transgene. For transductions, each AAV vector was surgically delivered below the retina by generating a space between the photoreceptor and retinal pigment epithelium layer through delivery of a vector bolus with an injection device. The vector bolus was left in the sub-retinal space and the sub-retinal detachment resolved over time. GFP expression was monitored non-invasively by fundus photography of the retina of the animal following pupil dilation with TROPICAMIDE™. All of the presented retinas demonstrated varying degrees of successful targeting of ancestral AAVs to the retina.

Retinal histology also was performed and visualized under fluorescent microscopy to identify the transduced cell type(s). Histology was performed on a murine retina transduced with the Anc80L65 ancestral AAV vector as described above. Anc80L65-mediated eGFP expression was evident in the outer nuclear layer (ONL), the inner segments (IS), and the retinal pigment epithelium (RPE), indicating that the ancestral Anc80L65 vector targets murine photoreceptors and retinal pigment epithelial cells.

Example 6 Neutralizing Antibody Assay

Neutralizing antibody assays were performed to evaluate whether or not an ancestral AAV virus is more resistant to antibody-neutralization than a contemporary AAV virus. Neutralizing antibody assays measure the antibody concentration (or the titer at which an experimental sample contains an antibody concentration) that neutralizes an infection by 50% or more as compared to a control in the absence of the antibody.

Serum samples or IVIG stock solution (200 mg/ml) were serially diluted by 2-fold, and undiluted and diluted samples were co-incubated with an ancestral AAV virus, Anc80L65, and a contemporary AAV virus, AAV2/8, at a MOI of 10⁴ for about 30 minutes at 37° C. Each virus included a luciferase transgene. The admixed vector and an antibody sample then were transduced into HEK293 cells. For these experiments, the antibody sample used was intravenous immunoglobulin (IVIG), pooled IgGs extracted from the plasma of over one thousand blood donors (sold commercially, for example, as GAMMAGARD™ (Baxter Healthcare; Deerfield, Ill.) or GAMUNEX™ (Grifols; Los Angeles, Calif.)). 48 hours following initiation of transduction, cells were assayed by bioluminescence to detect luciferase. Neutralizing antibody titer was determined by identifying the dilution of sample for which 50% or more neutralization (transduction of sample/transduction of control virus in absence of sample) was reached.
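The titer determination described above can be sketched as follows. The sketch assumes that neutralization at each dilution is computed as 1 minus the ratio of luciferase signal with sample to signal of the control virus without sample, and that the titer is reported as the highest dilution factor at which neutralization is still 50% or more; the function name and the luminescence readings are hypothetical.

```python
def neutralizing_titer(readings, control_signal, cutoff=0.5):
    """Return the highest dilution factor giving >= cutoff neutralization.

    readings: iterable of (dilution_factor, luciferase_signal) pairs from a
    serial dilution series (1 = undiluted, 2 = 1:2, 4 = 1:4, ...).
    control_signal: luciferase signal of virus incubated without sample.
    Returns None if no dilution reaches the cutoff.
    """
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    titer = None
    for dilution, signal in readings:
        neutralization = 1.0 - signal / control_signal
        if neutralization >= cutoff and (titer is None or dilution > titer):
            titer = dilution
    return titer


# Hypothetical 2-fold dilution series (relative light units, illustrative).
series = [(1, 50), (2, 120), (4, 300), (8, 480), (16, 700), (32, 900)]
print(neutralizing_titer(series, control_signal=1000))
```

With these made-up readings, neutralization falls from 95% (undiluted) to 52% at 1:8 and 30% at 1:16, so the reported titer is the 1:8 dilution.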

Example 7 Characterization of Anc80

Based on the methods described herein, the most probable Anc80 sequence (as determined through posterior probability) was obtained and designated Anc80L1 (SEQ ID NO: 35 shows the nucleic acid sequence of the Anc80L1 capsid and SEQ ID NO: 36 shows the amino acid sequence of the Anc80L1 VP1 polypeptide). The Anc80 probabilistic library also was synthesized using the sequences described herein by a commercial company and sub-cloned into expression vectors.

The Anc80 library was clonally evaluated for vector yield and infectivity in combined assays. Out of this screening, Anc80L65 (SEQ ID NO: 23), as well as several other variants, were further characterized.

The Anc80 library and Anc80L65 were compared in terms of sequence difference (FIG. 8; % up from diagonal, # of amino acid differences below). Using NCBI-BLAST, the closest publicly available sequence to Anc80L65 is rh10 (GenBank Accession No. AAO88201.1).

FIG. 9 shows that Anc80L65 produced vector yields equivalent to AAV2 (Panel A), generated virus particles as observed by transmission electron microscopy (TEM) (Panel B), and biochemically produced the AAV cap and the VP1, VP2, and VP3 proteins based on SDS-PAGE under denaturing conditions (Panel C) and Western blotting using the AAV capsid antibody, B1 (Panel D). These experiments are described in more detail in the following paragraphs.

Briefly, AAV2/8, AAV2/2, AAV2/Anc80L27, AAV2/Anc80L44, and AAV2/Anc80L65 vectors containing a reporter construct comprised of eGFP and firefly luciferase under a CMV promoter were produced in small scale. Titers of these small scale preparations of viruses were then obtained via qPCR. Based on these experiments, Anc80L27, Anc80L44, and Anc80L65 vectors were found to produce viral levels comparable to that of AAV2 (FIG. 9A).

To confirm that the Anc80L65 capsid proteins assembled into intact virus-like particles of the proper size and conformation, micrographs were obtained using transmission electron microscopy (TEM). A large scale, purified preparation of Anc80L65 was loaded onto polyvinyl formal (FORMVAR®) coated copper grids and was then stained with uranyl acetate. Micrographs revealed intact, hexagonal particles with diameters between 20 and 25 nm (FIG. 9B).

In order to determine whether the synthetic ancestral capsid genes were properly processed (i.e., spliced and expressed), large-scale purified preparations of AAV2/8, AAV2/2, and AAV2/Anc80L65 vectors were loaded onto an SDS-PAGE gel (1E10 GC/well) under denaturing conditions. Bands representing viral capsid proteins VP1, VP2, and VP3 were clearly present for each vector preparation (FIG. 9C). Western blotting with the AAV capsid antibody B1 further confirmed that these bands represented the predicted proteins (FIG. 9D).

In addition, FIG. 10 shows that Anc80L65 infected mammalian tissue and cells, in vitro in HEK293 cells at MOI 10E4 GC/cell using GFP (Panel A) or luciferase (Panel B) as readout, versus AAV2 and/or AAV8 controls. Anc80L65 also was efficient at targeting liver following an IV injection of the indicated AAV encoding a nuclear LacZ transgene (top row, Panel C), following direct intramuscular (IM) injection of the indicated AAV encoding GFP (middle row, Panel C), and following subretinal injection with the indicated AAV encoding GFP (bottom row, Panel C). These experiments are described in more detail in the following paragraphs.

To obtain a relative measure of the infectivity of ancestral virions, crude preparations of AAV2/2, AAV2/8, AAV2/Anc80L65, AAV2/Anc80L44, AAV2/Anc80L27, AAV2/Anc80L121, AAV2/Anc80L122, AAV2/Anc80L123, AAV2/Anc80L124, and AAV2/Anc80L125 containing a bi-cistronic reporter construct that includes eGFP and firefly luciferase sequences under control of a CMV promoter were produced. 96-well plates confluent with HEK293 cells were then subjected to transduction with each vector at an MOI of 1E4 GC/cell (titers obtained via qPCR as above). 48 hours later, fluorescence microscopy confirmed the presence of GFP in transduced cells (FIG. 10A). Cells were then assayed for the presence of luciferase (FIG. 10B), which determined that expression of luciferase in cells transduced with Anc80-derived vectors was in between that of cells transduced with AAV8 (lower level of transduction) and AAV2 (higher level of transduction).
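Dosing each well at a fixed MOI requires converting the qPCR titer into a volume of vector preparation. The arithmetic can be sketched as below; the function name, the cell count per well, and the titer are hypothetical values chosen for illustration and are not taken from the experiments described here.

```python
def vector_volume_ul(moi_gc_per_cell, cells_per_well, titer_gc_per_ml):
    """Volume of vector preparation (in microliters) needed for one well.

    total genome copies = MOI (GC/cell) x cells per well
    volume (mL)         = total genome copies / titer (GC/mL)
    """
    if titer_gc_per_ml <= 0:
        raise ValueError("titer must be positive")
    total_gc = moi_gc_per_cell * cells_per_well
    return total_gc / titer_gc_per_ml * 1000.0


# MOI of 1E4 GC/cell as in the text; the cell number per well and the
# titer are assumed values for the sake of the example.
example_volume = vector_volume_ul(
    moi_gc_per_cell=1e4,
    cells_per_well=4e4,
    titer_gc_per_ml=1e11,
)
print(f"{example_volume:.2f} uL per well")
```

Under these assumed numbers each well would receive about 4 µL of preparation; a lower-titer preparation would need proportionally more volume to reach the same MOI.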

To assess the relative efficiency of gene transfer in an in vivo context, purified high-titer preparations of AAV2/2, AAV2/8, and AAV2/Anc80L65 were obtained. 3.9E10 GC of each vector, encapsidating a transgene encoding nuclear LacZ under control of a TBG promoter, were injected into C57BL/6 mice (3 mice per condition) via IP injection following general anesthetization. 28 days post-injection, mice were sacrificed and tissues were collected. Livers were sectioned via standard histological techniques and stained for beta-galactosidase. Sections were then imaged under a microscope and representative images are shown in FIG. 10C, top row.

Vectors of the same serotypes were then obtained containing a bicistronic transgene encoding eGFP and hA1AT under control of a pCASI promoter. To assess the ability of Anc80L65 to transduce murine skeletal muscle, 1E10 GC of each vector was injected into skeletal muscle of C57BL/6 mice (5 mice per condition) following general anesthetization. 28 days post-injection, mice were sacrificed, tissues were cryosectioned, and the presence of eGFP was assessed using fluorescent confocal microscopy (blue is DAPI, green is eGFP). Representative images are shown in FIG. 10C, middle row. These experiments demonstrated that Anc80L65 vectors were capable of transducing murine skeletal muscle via intramuscular injection.

Vectors of the same serotypes were obtained, this time encapsidating constructs encoding only an eGFP transgene under control of a CMV promoter. 2E9 particles were injected sub-retinally into C57BL/6 mice following general anesthetization. 28 days post-injection, mice were sacrificed, the eyes were collected and cryosectioned, and the presence of eGFP was assessed using fluorescent confocal microscopy (blue is DAPI, green is eGFP). Representative images are shown in FIG. 10C, bottom row. These experiments demonstrate that Anc80L65 vectors are able to transduce murine retina at a level that is comparable to AAV8 vectors.

Briefly, purified, high titer preparations of AAV2/8, AAV2/2, AAV2/rh32.33, and AAV2/Anc80L65 viral vectors encapsidating a bicistronic transgene that includes eGFP and firefly luciferase under control of a CMV promoter were obtained. These vectors were then either incubated with two-fold serial dilutions of IVIG (10 mg, 5 mg, 2.5 mg, etc.) or incubated without IVIG (1E9 GC per condition). Following incubation, vectors were used to transduce HEK293 cells at an MOI of 1E4 per well (one dilution per well).

Example 8 Generation of Additional Ancestral AAV Capsids

The most probable ancestral AAV capsid sequences (as determined through posterior probability) were then synthesized through a commercial lab (Gen9) and provided as linear dsDNA. These amino acid sequences were then compared to those of extant AAVs in order to ascertain the degree to which they differ (FIG. 11). Each ancestral VP1 protein differs from those of selected representative extant AAVs by between 3.6% and 9.3% (FIG. 11A), while the ancestral VP3 proteins differ by between 4.2% and 9.4% (FIG. 11B). These capsids were each subcloned into AAV production plasmids (pAAVector2/Empty) via restriction enzyme digestion (HindIII & SpeI) and T4 ligation. These clones were confirmed via restriction digestion and Sanger sequencing, and medium scale preparations of plasmid DNA were then produced.

Each of these plasmids was then used to produce AAV vectors containing a reporter gene encoding both eGFP and firefly luciferase. These vectors were produced in triplicate in small scale as previously described. Crude preparations of the virus were then titered via qPCR and were found to produce between 2.71% and 183.1% viral particles relative to AAV8 (FIGS. 12 and 13). These titers were then used to set up a titer-controlled experiment to assess relative infectivity. Anc126 was not titer-controlled due to its significantly depressed production, and consequently, the data regarding the infectivity of Anc126 cannot be accurately compared to the infectivity of the other viruses in the experiment. The other vectors were used to transduce HEK293 cells at a multiplicity of infection (MOI) of 1.9E3 GC/cell.

60 hours post transduction, cells were assessed for GFP expression via fluorescence microscopy. eGFP positive cells were detected under each of the conditions except for the negative control (FIG. 14). This indicates that each of the ancestral sequences that were predicted, synthesized, and cloned is capable of producing viable, infectious virus particles. To estimate the relative levels of infectivity, luciferase assays also were performed on the same cells. The results indicate that each of the ancestral vectors is capable of transducing HEK293 cells at between 28.3% and 850.8% of the level of AAV8 (FIGS. 15 and 16). It is noted that Anc126 was excluded from the analysis of relative transduction since it was not titer-controlled.
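Expressing each vector's luciferase signal as a percentage of the AAV8 signal, while leaving out any vector that was not titer-controlled, can be sketched as follows. The readings are arbitrary placeholder numbers, "AncX" and "AncY" are hypothetical vector names, and the function name is an assumption; none of these are data from FIGS. 15 and 16.

```python
def percent_of_reference(readings, reference="AAV8", exclude=()):
    """Express each reading as a percentage of the reference vector.

    Vectors named in `exclude` (for example, ones that were not
    titer-controlled) are dropped from the result.
    """
    reference_value = readings[reference]
    if reference_value <= 0:
        raise ValueError("reference reading must be positive")
    return {
        name: 100.0 * value / reference_value
        for name, value in readings.items()
        if name not in exclude
    }


# Placeholder luciferase readings (arbitrary units), for illustration only.
placeholder_readings = {
    "AAV8": 2000.0,
    "AncX": 5000.0,
    "AncY": 9000.0,
    "Anc126": 100.0,
}
relative = percent_of_reference(placeholder_readings, exclude=("Anc126",))
for name, percent in relative.items():
    print(f"{name}: {percent:.1f}% of AAV8")
```

Normalizing to a single reference vector only gives comparable numbers when every vector was applied at the same MOI, which is why a vector that was not titer-controlled is excluded rather than reported.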

In summary, eight novel ancestral AAV capsid genes were synthesized and used in the production of functional viral vectors along with AAV8, AAV2, and the previously described Anc80L65 vectors. Production and infectivity were assessed in vitro and a summary of those findings is shown in FIG. 17.

Example 9 Vectored Immunoprophylaxis

In vectored immunoprophylaxis, gene therapy vehicles (such as AAV) are used to deliver transgenes encoding broadly neutralizing antibodies against infectious agents. See, for example, Balazs et al. (2013, Nat. Biotechnol., 31:647-52); Limberis et al. (2013, Sci. Transl. Med., 5:187ra72); Balazs et al. (2012, Nature, 481:81-4); and Deal et al. (2014, PNAS USA, 111:12528-32). One advantage of this treatment is that the host produces the antibodies in their own cells, meaning that a single administration has the potential to confer a lifetime of protection against etiologic agents.

Example 10 Drug Delivery Vehicles

LUCENTIS® (ranibizumab) and AVASTIN® (bevacizumab) are both anti-angiogenesis agents based on the same humanized mouse monoclonal antibody against vascular endothelial growth factor A (VEGF-A). Although bevacizumab is a full antibody and ranibizumab is a fragment (Fab), they both act to treat wet age-related macular degeneration through the same mechanism, namely by antagonizing VEGF. See, for example, Mao et al. (2011, Hum. Gene Ther., 22:1525-35); Xie et al. (2014, Gynecol. Oncol., doi: 10.1016/j.ygyno.2014.07.105); and Watanabe et al. (2010, Gene Ther., 17:1042-51). Because both of these molecules are proteins, they can be encoded by DNA and produced in cells transduced with vectors containing a transgene, and are small enough to be packaged into AAV vectors.

Other Embodiments

It is to be understood that, while the methods and compositions of matter have been described herein in conjunction with a number of different aspects, the foregoing description of the various aspects is intended to illustrate and not limit the scope of the methods and compositions of matter. Other aspects, advantages, and modifications are within the scope of the following claims.

Disclosed are methods and compositions that can be used for, can be used in conjunction with, can be used in preparation for, or are products of the disclosed methods and compositions. These and other materials are disclosed herein, and it is understood that combinations, subsets, interactions, groups, etc. of these methods and compositions are disclosed. That is, while specific reference to each various individual and collective combinations and permutations of these compositions and methods may not be explicitly disclosed, each is specifically contemplated and described herein. For example, if a particular composition of matter or a particular method is disclosed and discussed and a number of compositions or methods are discussed, each and every combination and permutation of the compositions and the methods are specifically contemplated unless specifically indicated to the contrary. Likewise, any subset or combination of these is also specifically contemplated and disclosed.

What is claimed is:
 1. An adeno-associated virus (AAV) capsid polypeptide having the amino acid sequence shown in SEQ ID NO:5.
 2. The AAV capsid polypeptide of claim 1, wherein the AAV capsid polypeptide or a virus particle comprising the AAV capsid polypeptide: exhibits a lower seroprevalence than does an AAV2 capsid polypeptide or a virus particle comprising an AAV2 capsid polypeptide, and wherein the AAV capsid polypeptide or a virus particle comprising the AAV capsid polypeptide exhibits about the same or a lower seroprevalence than does an AAV8 capsid polypeptide or a virus particle comprising an AAV8 capsid polypeptide; and/or is neutralized to a lesser extent by human serum than is an AAV2 capsid polypeptide or a virus particle comprising an AAV2 capsid polypeptide, and wherein the AAV capsid polypeptide or a virus particle comprising the AAV capsid polypeptide is neutralized to a similar or lesser extent by human serum than is an AAV8 capsid polypeptide or a virus particle comprising an AAV8 capsid polypeptide.
 3. The AAV capsid polypeptide of claim 1, wherein the AAV capsid polypeptide is purified.
 4. The AAV capsid polypeptide of claim 1, encoded by the nucleic acid sequence shown in SEQ ID NO:6.
 5. A nucleic acid molecule encoding an adeno-associated virus (AAV) capsid polypeptide having the nucleic acid sequence shown in SEQ ID NO:6.
 6. A vector comprising the nucleic acid molecule of claim 5.
 7. An isolated host cell comprising the vector of claim 6.
 8. A purified virus particle comprising the AAV capsid polypeptide of claim 1.
 9. The purified virus particle of claim 8, further comprising a transgene.